Overview

Dataset statistics

Number of variables: 17
Number of observations: 173659
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.9 MiB
Average record size in memory: 120.0 B
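
The total memory figure follows from the row count and the average record size. A minimal sketch of that arithmetic, using only the numbers listed above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_rows = 173_659
avg_record_bytes = 120.0

total_bytes = n_rows * avg_record_bytes
total_mib = total_bytes / 2**20  # 1 MiB = 1,048,576 bytes

print(f"{total_mib:.1f} MiB")
```

The result rounds to the 19.9 MiB shown in the report.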

Variable types

Numeric: 8
Categorical: 6
Unsupported: 3

Alerts

OrdenDeCompraID has a high cardinality: 1472 distinct values [High cardinality]
FechaTransaccion has a high cardinality: 2155 distinct values [High cardinality]
NombreProducto has a high cardinality: 227 distinct values [High cardinality]
TransaccionProductoID is highly correlated with InvoiceID and 1 other fields [High correlation]
ProductoID is highly correlated with Cantidad and 1 other fields [High correlation]
InvoiceID is highly correlated with TransaccionProductoID and 1 other fields [High correlation]
Cantidad is highly correlated with ProductoID and 3 other fields [High correlation]
ValorTotal_Con_PrecioUnitario is highly correlated with Cantidad and 2 other fields [High correlation]
ValorTotal_Con_PrecioRecomendado is highly correlated with Cantidad and 2 other fields [High correlation]
Año is highly correlated with TransaccionProductoID and 1 other fields [High correlation]
Cantidad_abs is highly correlated with ProductoID and 3 other fields [High correlation]
TransaccionProductoID is highly correlated with InvoiceID and 1 other fields [High correlation]
TipoTransaccionID is highly correlated with Cantidad and 3 other fields [High correlation]
InvoiceID is highly correlated with TransaccionProductoID and 1 other fields [High correlation]
Cantidad is highly correlated with TipoTransaccionID and 3 other fields [High correlation]
ValorTotal_Con_PrecioUnitario is highly correlated with TipoTransaccionID and 3 other fields [High correlation]
ValorTotal_Con_PrecioRecomendado is highly correlated with TipoTransaccionID and 3 other fields [High correlation]
Año is highly correlated with TransaccionProductoID and 1 other fields [High correlation]
Cantidad_abs is highly correlated with TipoTransaccionID and 3 other fields [High correlation]
TransaccionProductoID is highly correlated with InvoiceID and 1 other fields [High correlation]
InvoiceID is highly correlated with TransaccionProductoID and 1 other fields [High correlation]
Cantidad is highly correlated with Cantidad_abs [High correlation]
ValorTotal_Con_PrecioUnitario is highly correlated with ValorTotal_Con_PrecioRecomendado and 1 other fields [High correlation]
ValorTotal_Con_PrecioRecomendado is highly correlated with ValorTotal_Con_PrecioUnitario and 1 other fields [High correlation]
Año is highly correlated with TransaccionProductoID and 1 other fields [High correlation]
Cantidad_abs is highly correlated with Cantidad and 2 other fields [High correlation]
ProveedorID is highly correlated with TipoTransaccionID [High correlation]
TipoTransaccionID is highly correlated with ProveedorID [High correlation]
TransaccionProductoID is highly correlated with InvoiceID and 1 other fields [High correlation]
TipoTransaccionID is highly correlated with InvoiceID and 5 other fields [High correlation]
InvoiceID is highly correlated with TransaccionProductoID and 7 other fields [High correlation]
ProveedorID is highly correlated with TipoTransaccionID and 5 other fields [High correlation]
Cantidad is highly correlated with TipoTransaccionID and 5 other fields [High correlation]
ValorTotal_Con_PrecioUnitario is highly correlated with TipoTransaccionID and 5 other fields [High correlation]
ValorTotal_Con_PrecioRecomendado is highly correlated with TipoTransaccionID and 5 other fields [High correlation]
Año is highly correlated with TransaccionProductoID and 1 other fields [High correlation]
Cantidad_abs is highly correlated with TipoTransaccionID and 5 other fields [High correlation]
TransaccionProductoID is uniformly distributed [Uniform]
TransaccionProductoID has unique values [Unique]
FechaTransaccionAjustada is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
PrecioUnitario is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
PrecioRecomendado is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
ClienteID has 6072 (3.5%) zeros [Zeros]
InvoiceID has 6072 (3.5%) zeros [Zeros]
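
Several of these alerts line up numerically with the TipoTransaccionID category counts reported further down (10: 167587, 11: 6035, 12: 37). The sketch below only checks that the aggregate counts are consistent with each other. It is an assumption, not something the report confirms row by row, that the zero ClienteID/InvoiceID rows are the type 11 and 12 transactions and that the empty ProveedorID rows are the type 10 and 12 ones.

```python
# Aggregate counts copied from the report; names are illustrative.
n_rows = 173_659
type_counts = {"10": 167_587, "11": 6_035, "12": 37}
zeros_cliente = 6_072       # same figure is reported for InvoiceID
empty_proveedor = 167_624   # empty ProveedorID / OrdenDeCompraID values

# The three transaction types account for every row.
total_by_type = sum(type_counts.values())

# Candidate explanation (assumption): zeros come from types 11 and 12,
# empty supplier fields from types 10 and 12.
types_11_and_12 = type_counts["11"] + type_counts["12"]
types_10_and_12 = type_counts["10"] + type_counts["12"]

zero_share_pct = 100 * zeros_cliente / n_rows
print(total_by_type, types_11_and_12, types_10_and_12, f"{zero_share_pct:.1f}%")
```

Matching totals are suggestive only; confirming the relationship would need a cross-tabulation on the underlying data.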

Reproduction

Analysis started: 2024-06-17 23:18:13.054072
Analysis finished: 2024-06-17 23:18:39.438385
Duration: 26.38 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
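
The reported duration can be recomputed from the two timestamps. A small sketch using only the standard library (the format string is an assumption that matches the timestamps as printed above):

```python
from datetime import datetime

# Timestamp layout as printed in the "Reproduction" block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-06-17 23:18:13.054072", FMT)
finished = datetime.strptime("2024-06-17 23:18:39.438385", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```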

Variables

TransaccionProductoID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 173659
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 212512.8321
Minimum: 89146
Maximum: 336251
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 678.5 KiB

Quantile statistics

Minimum: 89146
5-th percentile: 101474.9
Q1: 150707.5
median: 212419
Q3: 274254.5
95-th percentile: 323926.1
Maximum: 336251
Range: 247105
Interquartile range (IQR): 123547

Descriptive statistics

Standard deviation: 71345.47913
Coefficient of variation (CV): 0.335723158
Kurtosis: -1.199871973
Mean: 212512.8321
Median Absolute Deviation (MAD): 61764
Skewness: 0.003294818852
Sum: 3.690476592 × 10^10
Variance: 5090177393
Monotonicity: Not monotonic
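
Several of the derived statistics for TransaccionProductoID can be recomputed from the primary ones. A minimal sketch, assuming the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative:

```python
# Primary figures copied from the report for TransaccionProductoID.
minimum, maximum = 89_146, 336_251
q1, q3 = 150_707.5, 274_254.5
mean, std = 212_512.8321, 71_345.47913

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # relative dispersion
variance = std ** 2               # squared standard deviation

print(value_range, iqr, round(cv, 9), round(variance))
```

Because the inputs are rounded as printed, the recomputed variance agrees with the reported value only up to rounding.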
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
221111 | 1 | < 0.1%
111704 | 1 | < 0.1%
279685 | 1 | < 0.1%
192618 | 1 | < 0.1%
319310 | 1 | < 0.1%
109812 | 1 | < 0.1%
159838 | 1 | < 0.1%
189996 | 1 | < 0.1%
254469 | 1 | < 0.1%
291210 | 1 | < 0.1%
Other values (173649) | 173649 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
89146 | 1 | < 0.1%
89147 | 1 | < 0.1%
89148 | 1 | < 0.1%
89149 | 1 | < 0.1%
89150 | 1 | < 0.1%
89151 | 1 | < 0.1%
89153 | 1 | < 0.1%
89154 | 1 | < 0.1%
89181 | 1 | < 0.1%
89183 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
336251 | 1 | < 0.1%
336250 | 1 | < 0.1%
336249 | 1 | < 0.1%
336248 | 1 | < 0.1%
336247 | 1 | < 0.1%
336245 | 1 | < 0.1%
336244 | 1 | < 0.1%
336243 | 1 | < 0.1%
336242 | 1 | < 0.1%
336241 | 1 | < 0.1%
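
The UNIFORM flag on TransaccionProductoID is consistent with its kurtosis of about -1.2, which is the excess kurtosis of a uniform distribution. The sketch below illustrates this on a synthetic run of consecutive IDs. It uses the plain population formula, whereas the report presumably uses a bias-corrected estimator, so small differences are expected; the function name and the synthetic ID range are illustrative.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2**2 - 3.0

# Synthetic, evenly spread identifiers (not the report's actual data).
ids = range(1, 10_001)
k = excess_kurtosis(ids)
print(round(k, 4))
```

For evenly spread values the result sits very close to -1.2, in line with the -1.199871973 reported for this variable.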

ProductoID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 227
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 110.6737745
Minimum: 1
Maximum: 227
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 678.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 12
Q1: 56
median: 110
Q3: 166
95-th percentile: 210
Maximum: 227
Range: 226
Interquartile range (IQR): 110

Descriptive statistics

Standard deviation: 63.52289916
Coefficient of variation (CV): 0.5739652367
Kurtosis: -1.192928697
Mean: 110.6737745
Median Absolute Deviation (MAD): 55
Skewness: 0.01168125345
Sum: 19219497
Variance: 4035.158718
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
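Common values

Value | Count | Frequency (%)
80 | 1274 | 0.7%
95 | 1254 | 0.7%
184 | 1150 | 0.7%
86 | 1110 | 0.6%
193 | 1109 | 0.6%
77 | 1101 | 0.6%
204 | 1096 | 0.6%
78 | 1089 | 0.6%
98 | 1087 | 0.6%
214 | 841 | 0.5%
Other values (217) | 162548 | 93.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 767 | 0.4%
2 | 790 | 0.5%
3 | 763 | 0.4%
4 | 785 | 0.5%
5 | 751 | 0.4%
6 | 758 | 0.4%
7 | 760 | 0.4%
8 | 751 | 0.4%
9 | 773 | 0.4%
10 | 782 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
227 | 118 | 0.1%
226 | 151 | 0.1%
225 | 127 | 0.1%
224 | 118 | 0.1%
223 | 117 | 0.1%
222 | 149 | 0.1%
221 | 131 | 0.1%
220 | 135 | 0.1%
219 | 794 | 0.5%
218 | 749 | 0.4%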

TipoTransaccionID
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

Category | Count
10 | 167587
11 | 6035
12 | 37

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 347318
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10
2nd row: 10
3rd row: 10
4th row: 10
5th row: 10

Common Values

Value | Count | Frequency (%)
10 | 167587 | 96.5%
11 | 6035 | 3.5%
12 | 37 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
10 | 167587 | 96.5%
11 | 6035 | 3.5%
12 | 37 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 179694 | 51.7%
0 | 167587 | 48.3%
2 | 37 | < 0.1%
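
The character counts above follow directly from the three category labels and their frequencies. A minimal sketch that rebuilds them (the dictionary holds the counts reported for TipoTransaccionID; the variable names are illustrative):

```python
from collections import Counter

# Category label -> number of rows, as reported for TipoTransaccionID.
category_counts = {"10": 167_587, "11": 6_035, "12": 37}

char_counts = Counter()
for label, count in category_counts.items():
    for ch in label:
        # Each occurrence of the label contributes one of each of its characters.
        char_counts[ch] += count

total_chars = sum(char_counts.values())
print(dict(char_counts), total_chars)
```

The character "1" appears once in "10" and "12" but twice in "11", which is why its count exceeds the number of rows.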

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 347318 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 179694 | 51.7%
0 | 167587 | 48.3%
2 | 37 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 347318 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 179694 | 51.7%
0 | 167587 | 48.3%
2 | 37 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 347318 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 179694 | 51.7%
0 | 167587 | 48.3%
2 | 37 | < 0.1%

ClienteID
Real number (ℝ≥0)

ZEROS

Distinct: 664
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 517.2418821
Minimum: 0
Maximum: 1061
Zeros: 6072
Zeros (%): 3.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 10
Q1: 145
median: 511
Q3: 877
95-th percentile: 1010
Maximum: 1061
Range: 1061
Interquartile range (IQR): 732

Descriptive statistics

Standard deviation: 353.4912165
Coefficient of variation (CV): 0.6834156875
Kurtosis: -1.451732608
Mean: 517.2418821
Median Absolute Deviation (MAD): 366
Skewness: -0.02666969492
Sum: 89823708
Variance: 124956.0401
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 6072 | 3.5%
980 | 352 | 0.2%
810 | 341 | 0.2%
185 | 337 | 0.2%
558 | 336 | 0.2%
149 | 335 | 0.2%
954 | 334 | 0.2%
804 | 333 | 0.2%
953 | 331 | 0.2%
118 | 329 | 0.2%
Other values (654) | 164559 | 94.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 6072 | 3.5%
1 | 295 | 0.2%
2 | 283 | 0.2%
3 | 302 | 0.2%
4 | 209 | 0.1%
5 | 286 | 0.2%
6 | 255 | 0.1%
7 | 266 | 0.2%
8 | 198 | 0.1%
9 | 275 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
1061 | 29 | < 0.1%
1060 | 15 | < 0.1%
1059 | 30 | < 0.1%
1058 | 50 | < 0.1%
1057 | 55 | < 0.1%
1056 | 45 | < 0.1%
1055 | 78 | < 0.1%
1054 | 84 | < 0.1%
1053 | 71 | < 0.1%
1052 | 92 | 0.1%

InvoiceID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 51831
Distinct (%): 29.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42972.01054
Minimum: 0
Maximum: 70510
Zeros: 6072
Zeros (%): 3.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 19484
Q1: 30149
median: 43565
Q3: 56988
95-th percentile: 67835
Maximum: 70510
Range: 70510
Interquartile range (IQR): 26839

Descriptive statistics

Standard deviation: 16823.71872
Coefficient of variation (CV): 0.3915041094
Kurtosis: -0.3824902
Mean: 42972.01054
Median Absolute Deviation (MAD): 13420
Skewness: -0.3662493315
Sum: 7462476379
Variance: 283037511.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 6072 | 3.5%
30531 | 5 | < 0.1%
19791 | 5 | < 0.1%
68698 | 5 | < 0.1%
31099 | 5 | < 0.1%
53188 | 5 | < 0.1%
35760 | 5 | < 0.1%
31047 | 5 | < 0.1%
51800 | 5 | < 0.1%
57229 | 5 | < 0.1%
Other values (51821) | 167542 | 96.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 6072 | 3.5%
18681 | 1 | < 0.1%
18682 | 1 | < 0.1%
18683 | 1 | < 0.1%
18684 | 3 | < 0.1%
18685 | 3 | < 0.1%
18686 | 5 | < 0.1%
18687 | 4 | < 0.1%
18688 | 4 | < 0.1%
18689 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
70510 | 5 | < 0.1%
70509 | 5 | < 0.1%
70508 | 5 | < 0.1%
70507 | 3 | < 0.1%
70506 | 4 | < 0.1%
70505 | 3 | < 0.1%
70504 | 3 | < 0.1%
70503 | 5 | < 0.1%
70502 | 2 | < 0.1%
70501 | 3 | < 0.1%

ProveedorID
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

Category | Count
(empty) | 167624
4.0 | 4103
7.0 | 1922
1.0 | 10

Length

Max length: 3
Median length: 0
Mean length: 0.104256042
Min length: 0

Characters and Unicode

Total characters: 18105
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (empty)
2nd row: (empty)
3rd row: (empty)
4th row: (empty)
5th row: (empty)

Common Values

Value | Count | Frequency (%)
(empty) | 167624 | 96.5%
4.0 | 4103 | 2.4%
7.0 | 1922 | 1.1%
1.0 | 10 | < 0.1%
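
ProveedorID is profiled as text: identifiers appear as strings such as "4.0", and 96.5% of rows hold an empty string, so "Missing" reads 0 even though most rows carry no supplier. One possible cleanup is sketched below, assuming the empty string stands for "no supplier" (an assumption; the report does not say so). The helper name `parse_optional_id` is hypothetical.

```python
from typing import Optional


def parse_optional_id(raw: str) -> Optional[int]:
    """Turn '4.0' into 4 and an empty string into None."""
    text = raw.strip()
    if not text:
        return None  # assumption: empty means "no supplier"
    value = float(text)
    if not value.is_integer():
        raise ValueError(f"not an integer-valued id: {raw!r}")
    return int(value)


# The four distinct values reported for ProveedorID.
cleaned = [parse_optional_id(v) for v in ["", "4.0", "7.0", "1.0"]]
print(cleaned)
```

The same treatment would apply to OrdenDeCompraID, which shows the same pattern of empty strings and float-formatted identifiers.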

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
4.0 | 4103 | 68.0%
7.0 | 1922 | 31.8%
1.0 | 10 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
. | 6035 | 33.3%
0 | 6035 | 33.3%
4 | 4103 | 22.7%
7 | 1922 | 10.6%
1 | 10 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 12070 | 66.7%
Other Punctuation | 6035 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 6035 | 50.0%
4 | 4103 | 34.0%
7 | 1922 | 15.9%
1 | 10 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
. | 6035 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 18105 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 6035 | 33.3%
0 | 6035 | 33.3%
4 | 4103 | 22.7%
7 | 1922 | 10.6%
1 | 10 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 18105 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 6035 | 33.3%
0 | 6035 | 33.3%
4 | 4103 | 22.7%
7 | 1922 | 10.6%
1 | 10 | 0.1%

OrdenDeCompraID
Categorical

HIGH CARDINALITY

Distinct: 1472
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

Category | Count
(empty) | 167624
1292.0 | 6
1299.0 | 6
1301.0 | 6
1307.0 | 6
Other values (1467) | 6011

Length

Max length: 6
Median length: 0
Mean length: 0.1994656194
Min length: 0

Characters and Unicode

Total characters: 34639
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 76
Unique (%): < 0.1%

Sample

1st row: (empty)
2nd row: (empty)
3rd row: (empty)
4th row: (empty)
5th row: (empty)

Common Values

Value | Count | Frequency (%)
(empty) | 167624 | 96.5%
1292.0 | 6 | < 0.1%
1299.0 | 6 | < 0.1%
1301.0 | 6 | < 0.1%
1307.0 | 6 | < 0.1%
1311.0 | 6 | < 0.1%
1313.0 | 6 | < 0.1%
1315.0 | 6 | < 0.1%
1327.0 | 6 | < 0.1%
1333.0 | 6 | < 0.1%
Other values (1462) | 5981 | 3.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
1292.0 | 6 | 0.1%
1089.0 | 6 | 0.1%
860.0 | 6 | 0.1%
1877.0 | 6 | 0.1%
1930.0 | 6 | 0.1%
1708.0 | 6 | 0.1%
787.0 | 6 | 0.1%
1022.0 | 6 | 0.1%
1331.0 | 6 | 0.1%
1630.0 | 6 | 0.1%
Other values (1461) | 5975 | 99.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 7981 | 23.0%
. | 6035 | 17.4%
1 | 5786 | 16.7%
6 | 2024 | 5.8%
9 | 2016 | 5.8%
8 | 2000 | 5.8%
2 | 1954 | 5.6%
7 | 1954 | 5.6%
4 | 1652 | 4.8%
3 | 1620 | 4.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 28604 | 82.6%
Other Punctuation | 6035 | 17.4%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 7981 | 27.9%
1 | 5786 | 20.2%
6 | 2024 | 7.1%
9 | 2016 | 7.0%
8 | 2000 | 7.0%
2 | 1954 | 6.8%
7 | 1954 | 6.8%
4 | 1652 | 5.8%
3 | 1620 | 5.7%
5 | 1617 | 5.7%
Other Punctuation
Value | Count | Frequency (%)
. | 6035 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 34639 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 7981 | 23.0%
. | 6035 | 17.4%
1 | 5786 | 16.7%
6 | 2024 | 5.8%
9 | 2016 | 5.8%
8 | 2000 | 5.8%
2 | 1954 | 5.6%
7 | 1954 | 5.6%
4 | 1652 | 4.8%
3 | 1620 | 4.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 34639 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 7981 | 23.0%
. | 6035 | 17.4%
1 | 5786 | 16.7%
6 | 2024 | 5.8%
9 | 2016 | 5.8%
8 | 2000 | 5.8%
2 | 1954 | 5.6%
7 | 1954 | 5.6%
4 | 1652 | 4.8%
3 | 1620 | 4.7%

FechaTransaccion
Categorical

HIGH CARDINALITY

Distinct: 2155
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

Category | Count
2015-01-21 12:00:00.0000000 | 256
2016-05-04 12:00:00.0000000 | 254
2016-03-23 12:00:00.0000000 | 249
2015-11-03 12:00:00.0000000 | 247
2015-04-14 12:00:00.0000000 | 247
Other values (2150) | 172406

Length

Max length: 27
Median length: 27
Mean length: 21.07998434
Min length: 11

Characters and Unicode

Total characters: 3660729
Distinct characters: 37
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
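
The length and character totals for FechaTransaccion are consistent with the column mixing two date layouts: 27-character values such as "2015-01-21 12:00:00.0000000" and 11-character values such as "Apr 30,2015". The sketch below assumes each long value contains exactly one "." and each short value exactly one ",", and takes the row split from the character counts listed further down in this section (109405 and 64254). The variable names are illustrative.

```python
n_rows = 173_659

# Assumption: one '.' per 27-character value, one ',' per 11-character value.
long_rows, long_len = 109_405, 27    # "2015-01-21 12:00:00.0000000"
short_rows, short_len = 64_254, 11   # "Apr 30,2015"

total_chars = long_rows * long_len + short_rows * short_len
mean_length = total_chars / n_rows

print(long_rows + short_rows, total_chars, round(mean_length, 8))
```

Both the total character count and the mean length match the figures reported above, which supports the two-layout reading.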

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: Apr 30,2015
2nd row: Apr 13,2015
3rd row: May 13,2016
4th row: Apr 02,2015
5th row: Apr 01,2015
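
Because the column mixes "Apr 30,2015"-style values with "2015-01-21 12:00:00.0000000"-style values, it is profiled as high-cardinality text rather than as a date. One way to normalise it is sketched below with the standard library only. The helper name `parse_fecha` is hypothetical, and `%b` assumes English month abbreviations in the active locale. The seven-digit fraction is trimmed to six digits because `%f` accepts at most six.

```python
from datetime import datetime


def parse_fecha(raw: str) -> datetime:
    """Parse either 'Apr 30,2015' or '2015-01-21 12:00:00.0000000'."""
    text = raw.strip()
    try:
        # Short layout: month abbreviation, day, comma, year.
        return datetime.strptime(text, "%b %d,%Y")
    except ValueError:
        pass
    # Long layout: keep at most six fractional digits for %f.
    head, _, fraction = text.partition(".")
    return datetime.strptime(f"{head}.{fraction[:6]}", "%Y-%m-%d %H:%M:%S.%f")


short_form = parse_fecha("Apr 30,2015")
long_form = parse_fecha("2015-01-21 12:00:00.0000000")
print(short_form.isoformat(), long_form.isoformat())
```

Values matching neither layout raise `ValueError`, which makes unexpected formats visible instead of silently dropping them.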

Common Values

Value | Count | Frequency (%)
2015-01-21 12:00:00.0000000 | 256 | 0.1%
2016-05-04 12:00:00.0000000 | 254 | 0.1%
2016-03-23 12:00:00.0000000 | 249 | 0.1%
2015-11-03 12:00:00.0000000 | 247 | 0.1%
2015-04-14 12:00:00.0000000 | 247 | 0.1%
2015-10-06 12:00:00.0000000 | 246 | 0.1%
2015-10-19 12:00:00.0000000 | 246 | 0.1%
2015-11-24 12:00:00.0000000 | 245 | 0.1%
2016-05-19 12:00:00.0000000 | 241 | 0.1%
2015-02-27 12:00:00.0000000 | 241 | 0.1%
Other values (2145) | 171187 | 98.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
12:00:00.0000000 | 105572 | 30.4%
may | 6952 | 2.0%
apr | 6923 | 2.0%
jan | 6714 | 1.9%
mar | 6450 | 1.9%
feb | 5761 | 1.7%
jul | 5105 | 1.5%
oct | 4726 | 1.4%
jun | 4649 | 1.3%
dec | 4601 | 1.3%
Other values (856) | 189865 | 54.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 1542681 | 42.1%
1 | 397130 | 10.8%
2 | 371414 | 10.1%
- | 218810 | 6.0%
: | 218810 | 6.0%
(space) | 173659 | 4.7%
. | 109405 | 3.0%
5 | 103606 | 2.8%
4 | 97684 | 2.7%
, | 64254 | 1.8%
Other values (27) | 363276 | 9.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2683029 | 73.3%
Other Punctuation | 392469 | 10.7%
Dash Punctuation | 218810 | 6.0%
Space Separator | 173659 | 4.7%
Lowercase Letter | 128508 | 3.5%
Uppercase Letter | 64254 | 1.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 20116 | 15.7%
e | 14719 | 11.5%
u | 13717 | 10.7%
r | 13373 | 10.4%
n | 11363 | 8.8%
p | 11280 | 8.8%
c | 9327 | 7.3%
y | 6952 | 5.4%
b | 5761 | 4.5%
l | 5105 | 4.0%
Other values (4) | 16795 | 13.1%
Decimal Number
Value | Count | Frequency (%)
0 | 1542681 | 57.5%
1 | 397130 | 14.8%
2 | 371414 | 13.8%
5 | 103606 | 3.9%
4 | 97684 | 3.6%
6 | 55130 | 2.1%
3 | 38051 | 1.4%
7 | 28216 | 1.1%
9 | 25214 | 0.9%
8 | 23903 | 0.9%
Uppercase Letter
Value | Count | Frequency (%)
J | 16468 | 25.6%
M | 13402 | 20.9%
A | 10886 | 16.9%
F | 5761 | 9.0%
O | 4726 | 7.4%
D | 4601 | 7.2%
S | 4357 | 6.8%
N | 4053 | 6.3%
Other Punctuation
Value | Count | Frequency (%)
: | 218810 | 55.8%
. | 109405 | 27.9%
, | 64254 | 16.4%
Dash Punctuation
Value | Count | Frequency (%)
- | 218810 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 173659 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 3467967 | 94.7%
Latin | 192762 | 5.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 20116 | 10.4%
J | 16468 | 8.5%
e | 14719 | 7.6%
u | 13717 | 7.1%
M | 13402 | 7.0%
r | 13373 | 6.9%
n | 11363 | 5.9%
p | 11280 | 5.9%
A | 10886 | 5.6%
c | 9327 | 4.8%
Other values (12) | 58111 | 30.1%
Common
Value | Count | Frequency (%)
0 | 1542681 | 44.5%
1 | 397130 | 11.5%
2 | 371414 | 10.7%
- | 218810 | 6.3%
: | 218810 | 6.3%
(space) | 173659 | 5.0%
. | 109405 | 3.2%
5 | 103606 | 3.0%
4 | 97684 | 2.8%
, | 64254 | 1.9%
Other values (5) | 170514 | 4.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3660729 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1542681 | 42.1%
1 | 397130 | 10.8%
2 | 371414 | 10.1%
- | 218810 | 6.0%
: | 218810 | 6.0%
(space) | 173659 | 4.7%
. | 109405 | 3.0%
5 | 103606 | 2.8%
4 | 97684 | 2.7%
, | 64254 | 1.8%
Other values (27) | 363276 | 9.9%

Cantidad
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3383
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 716.7980813
Minimum: -360
Maximum: 67368
Zeros: 4
Zeros (%): < 0.1%
Negative: 167606
Negative (%): 96.5%
Memory size: 1.3 MiB

Quantile statistics

Minimum: -360
5-th percentile: -150
Q1: -60
median: -9
Q3: -5
95-th percentile: -1
Maximum: 67368
Range: 67728
Interquartile range (IQR): 55

Descriptive statistics

Standard deviation: 4718.727053
Coefficient of variation (CV): 6.583063176
Kurtosis: 61.82103881
Mean: 716.7980813
Median Absolute Deviation (MAD): 7
Skewness: 7.409037017
Sum: 124478438
Variance: 22266385.01
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
-10 | 11555 | 6.7%
-5 | 9456 | 5.4%
-8 | 9426 | 5.4%
-2 | 9388 | 5.4%
-7 | 9333 | 5.4%
-1 | 9299 | 5.4%
-3 | 9184 | 5.3%
-6 | 9152 | 5.3%
-4 | 9073 | 5.2%
-9 | 9051 | 5.2%
Other values (3373) | 78742 | 45.3%

Minimum 10 values

Value | Count | Frequency (%)
-360 | 143 | 0.1%
-324 | 159 | 0.1%
-288 | 141 | 0.1%
-260 | 95 | 0.1%
-252 | 154 | 0.1%
-250 | 878 | 0.5%
-240 | 794 | 0.5%
-234 | 73 | < 0.1%
-225 | 884 | 0.5%
-216 | 908 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
67368 | 1 | < 0.1%
67272 | 1 | < 0.1%
67200 | 1 | < 0.1%
66840 | 1 | < 0.1%
66744 | 1 | < 0.1%
66696 | 1 | < 0.1%
66480 | 1 | < 0.1%
66288 | 1 | < 0.1%
65904 | 1 | < 0.1%
65760 | 1 | < 0.1%

FechaTransaccionAjustada
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

NombreProducto
Categorical

HIGH CARDINALITY

Distinct: 227
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

Category | Count
"The Gu" red shirt XML tag t-shirt (White) M | 1274
"The Gu" red shirt XML tag t-shirt (Black) XL | 1254
Shipping carton (Brown) 305x305x305mm | 1150
"The Gu" red shirt XML tag t-shirt (White) 5XL | 1110
Black and orange glass with care despatch tape 48mmx75m | 1109
Other values (222) | 167762

Length

Max length: 85
Median length: 63
Mean length: 41.89720084
Min length: 20

Characters and Unicode

Total characters: 7275826
Distinct characters: 72
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Halloween skull mask (Gray) L
2nd row: Halloween skull mask (Gray) L
3rd row: Halloween skull mask (Gray) L
4th row: Halloween skull mask (Gray) L
5th row: Halloween skull mask (Gray) L

Common Values

Value | Count | Frequency (%)
"The Gu" red shirt XML tag t-shirt (White) M | 1274 | 0.7%
"The Gu" red shirt XML tag t-shirt (Black) XL | 1254 | 0.7%
Shipping carton (Brown) 305x305x305mm | 1150 | 0.7%
"The Gu" red shirt XML tag t-shirt (White) 5XL | 1110 | 0.6%
Black and orange glass with care despatch tape 48mmx75m | 1109 | 0.6%
"The Gu" red shirt XML tag t-shirt (White) XXS | 1101 | 0.6%
Tape dispenser (Red) | 1096 | 0.6%
"The Gu" red shirt XML tag t-shirt (White) XS | 1089 | 0.6%
"The Gu" red shirt XML tag t-shirt (Black) 4XL | 1087 | 0.6%
Air cushion film 200mmx200mm 325m | 841 | 0.5%
Other values (217) | 162548 | 93.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
black | 50705 | 3.9%
(blank) | 43584 | 3.4%
white | 34345 | 2.7%
mug | 34265 | 2.6%
joke | 32698 | 2.5%
red | 30338 | 2.3%
the | 27022 | 2.1%
gu | 22368 | 1.7%
shirt | 22368 | 1.7%
xml | 22368 | 1.7%
Other values (279) | 975583 | 75.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1128028 | 15.5%
e | 603207 | 8.3%
a | 371593 | 5.1%
r | 356443 | 4.9%
t | 338855 | 4.7%
i | 323671 | 4.4%
o | 285872 | 3.9%
l | 262299 | 3.6%
s | 233027 | 3.2%
m | 215519 | 3.0%
Other values (62) | 3157312 | 43.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4788417 | 65.8%
Space Separator | 1128028 | 15.5%
Uppercase Letter | 605649 | 8.3%
Decimal Number | 323713 | 4.4%
Close Punctuation | 141413 | 1.9%
Open Punctuation | 141413 | 1.9%
Dash Punctuation | 72284 | 1.0%
Other Punctuation | 70299 | 1.0%
Math Symbol | 4610 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 603207 | 12.6%
a | 371593 | 7.8%
r | 356443 | 7.4%
t | 338855 | 7.1%
i | 323671 | 6.8%
o | 285872 | 6.0%
l | 262299 | 5.5%
s | 233027 | 4.9%
m | 215519 | 4.5%
h | 214532 | 4.5%
Other values (16) | 1583399 | 33.1%
Uppercase Letter
Value | Count | Frequency (%)
B | 111310 | 18.4%
L | 68869 | 11.4%
X | 61472 | 10.1%
S | 46488 | 7.7%
D | 42767 | 7.1%
G | 37945 | 6.3%
M | 33740 | 5.6%
A | 33512 | 5.5%
W | 32841 | 5.4%
T | 31177 | 5.1%
Other values (12) | 105528 | 17.4%
Decimal Number
Value | Count | Frequency (%)
0 | 79618 | 24.6%
1 | 56059 | 17.3%
2 | 47409 | 14.6%
5 | 46209 | 14.3%
3 | 26749 | 8.3%
4 | 21400 | 6.6%
7 | 15854 | 4.9%
8 | 15703 | 4.9%
9 | 10800 | 3.3%
6 | 3912 | 1.2%
Other Punctuation
Value | Count | Frequency (%)
" | 44736 | 63.6%
/ | 13942 | 19.8%
, | 3142 | 4.5%
. | 2297 | 3.3%
' | 1583 | 2.3%
? | 1550 | 2.2%
: | 1539 | 2.2%
(character not preserved) | 1510 | 2.1%
Math Symbol
Value | Count | Frequency (%)
+ | 3084 | 66.9%
= | 1526 | 33.1%
Space Separator
Value | Count | Frequency (%)
(space) | 1128028 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 141413 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 141413 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 72284 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5394066 | 74.1%
Common | 1881760 | 25.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 603207 | 11.2%
a | 371593 | 6.9%
r | 356443 | 6.6%
t | 338855 | 6.3%
i | 323671 | 6.0%
o | 285872 | 5.3%
l | 262299 | 4.9%
s | 233027 | 4.3%
m | 215519 | 4.0%
h | 214532 | 4.0%
Other values (38) | 2189048 | 40.6%
Common
Value | Count | Frequency (%)
(space) | 1128028 | 59.9%
) | 141413 | 7.5%
( | 141413 | 7.5%
0 | 79618 | 4.2%
- | 72284 | 3.8%
1 | 56059 | 3.0%
2 | 47409 | 2.5%
5 | 46209 | 2.5%
" | 44736 | 2.4%
3 | 26749 | 1.4%
Other values (14) | 97842 | 5.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7274316 | > 99.9%
Punctuation | 1510 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1128028 | 15.5%
e | 603207 | 8.3%
a | 371593 | 5.1%
r | 356443 | 4.9%
t | 338855 | 4.7%
i | 323671 | 4.4%
o | 285872 | 3.9%
l | 262299 | 3.6%
s | 233027 | 3.2%
m | 215519 | 3.0%
Other values (61) | 3155802 | 43.4%
Punctuation
Value | Count | Frequency (%)
(character not preserved) | 1510 | 100.0%

PrecioUnitario
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

PrecioRecomendado
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

ValorTotal_Con_PrecioUnitario
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3826
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12390.09444
Minimum: 0
Maximum: 1038400
Zeros: 4
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 30
Q1: 100
median: 250
Q3: 1050
95-th percentile: 5600
Maximum: 1038400
Range: 1038400
Interquartile range (IQR): 950

Descriptive statistics

Standard deviation: 75509.20657
Coefficient of variation (CV): 6.094320501
Kurtosis: 65.01696815
Mean: 12390.09444
Median Absolute Deviation (MAD): 190
Skewness: 7.702427882
Sum: 2151651411
Variance: 5701640277
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
864 | 3615 | 2.1%
91 | 3443 | 2.0%
288 | 3437 | 2.0%
13 | 3364 | 1.9%
26 | 3328 | 1.9%
104 | 3326 | 1.9%
39 | 3285 | 1.9%
65 | 3269 | 1.9%
96 | 3243 | 1.9%
78 | 3240 | 1.9%
Other values (3816) | 140109 | 80.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 4 | < 0.1%
1 | 1 | < 0.1%
3 | 1 | < 0.1%
8 | 1 | < 0.1%
10 | 156 | 0.1%
13 | 3364 | 1.9%
16 | 222 | 0.1%
20 | 152 | 0.1%
25 | 1379 | 0.8%
26 | 3328 | 1.9%

Maximum 10 values

Value | Count | Frequency (%)
1038400 | 1 | < 0.1%
1035840 | 1 | < 0.1%
1035200 | 1 | < 0.1%
1031680 | 1 | < 0.1%
1031040 | 2 | < 0.1%
1030720 | 1 | < 0.1%
1027520 | 1 | < 0.1%
1022080 | 1 | < 0.1%
1021440 | 1 | < 0.1%
1020800 | 1 | < 0.1%

ValorTotal_Con_PrecioRecomendado
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3918
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18512.78642
Minimum: 0
Maximum: 1557600
Zeros: 4
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 45
Q1: 148
median: 375
Q3: 1548
95-th percentile: 8350
Maximum: 1557600
Range: 1557600
Interquartile range (IQR): 1400

Descriptive statistics

Standard deviation: 113183.9247
Coefficient of variation (CV): 6.113824364
Kurtosis: 65.22266078
Mean: 18512.78642
Median Absolute Deviation (MAD): 285
Skewness: 7.717590752
Sum: 3214911977
Variance: 1.28106008 × 10^10
Monotonicity: Not monotonic
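
Comparing the summary statistics of ValorTotal_Con_PrecioRecomendado with those of ValorTotal_Con_PrecioUnitario suggests the recommended-price total runs at roughly 1.5 times the unit-price total. The sketch below only compares the aggregate figures printed in the two sections. A ratio of quantiles is not the quantile of row-level ratios, so this is an indication rather than a confirmed per-row relationship; the dictionary names are illustrative.

```python
# Summary figures copied from the two ValorTotal_* sections.
unit = {
    "5-th percentile": 30, "Q1": 100, "median": 250, "Q3": 1050,
    "95-th percentile": 5600, "Maximum": 1_038_400, "Mean": 12_390.09444,
}
recommended = {
    "5-th percentile": 45, "Q1": 148, "median": 375, "Q3": 1548,
    "95-th percentile": 8350, "Maximum": 1_557_600, "Mean": 18_512.78642,
}

# Ratio of each recommended-price statistic to its unit-price counterpart.
ratios = {name: recommended[name] / unit[name] for name in unit}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
```

Every ratio falls between 1.47 and 1.50, which also helps explain why the two variables are flagged as highly correlated with each other.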
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1296 | 3509 | 2.0%
144 | 3492 | 2.0%
133 | 3443 | 2.0%
432 | 3426 | 2.0%
288 | 3367 | 1.9%
19 | 3364 | 1.9%
38 | 3328 | 1.9%
152 | 3326 | 1.9%
57 | 3285 | 1.9%
95 | 3269 | 1.9%
Other values (3908) | 139850 | 80.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 4 | < 0.1%
2 | 1 | < 0.1%
4 | 1 | < 0.1%
12 | 1 | < 0.1%
15 | 84 | < 0.1%
19 | 3364 | 1.9%
20 | 72 | < 0.1%
24 | 222 | 0.1%
25 | 137 | 0.1%
30 | 85 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1557600 | 1 | < 0.1%
1553760 | 1 | < 0.1%
1552800 | 1 | < 0.1%
1547520 | 1 | < 0.1%
1546560 | 2 | < 0.1%
1546080 | 1 | < 0.1%
1541280 | 1 | < 0.1%
1533120 | 1 | < 0.1%
1532160 | 1 | < 0.1%
1531200 | 1 | < 0.1%

Año
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

Category | Count
2015 | 74451
2014 | 68386
2016 | 30524
2013 | 298

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 694636
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015
2nd row: 2015
3rd row: 2016
4th row: 2015
5th row: 2015

Common Values

Value | Count | Frequency (%)
2015 | 74451 | 42.9%
2014 | 68386 | 39.4%
2016 | 30524 | 17.6%
2013 | 298 | 0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
2015 | 74451 | 42.9%
2014 | 68386 | 39.4%
2016 | 30524 | 17.6%
2013 | 298 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
2 | 173659 | 25.0%
0 | 173659 | 25.0%
1 | 173659 | 25.0%
5 | 74451 | 10.7%
4 | 68386 | 9.8%
6 | 30524 | 4.4%
3 | 298 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 694636 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 173659 | 25.0%
0 | 173659 | 25.0%
1 | 173659 | 25.0%
5 | 74451 | 10.7%
4 | 68386 | 9.8%
6 | 30524 | 4.4%
3 | 298 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 694636 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 173659 | 25.0%
0 | 173659 | 25.0%
1 | 173659 | 25.0%
5 | 74451 | 10.7%
4 | 68386 | 9.8%
6 | 30524 | 4.4%
3 | 298 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 694636 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 173659 | 25.0%
0 | 173659 | 25.0%
1 | 173659 | 25.0%
5 | 74451 | 10.7%
4 | 68386 | 9.8%
6 | 30524 | 4.4%
3 | 298 | < 0.1%
< 0.1%

Cantidad_abs
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3341
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 792.353912
Minimum: 0
Maximum: 67368
Zeros: 4
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 5
median: 10
Q3: 70
95-th percentile: 225
Maximum: 67368
Range: 67368
Interquartile range (IQR): 65

Descriptive statistics

Standard deviation: 4706.629273
Coefficient of variation (CV): 5.940059363
Kurtosis: 62.01186818
Mean: 792.353912
Median Absolute Deviation (MAD): 8
Skewness: 7.419282739
Sum: 137599388
Variance: 22152359.12
Monotonicity: Not monotonic
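
If Cantidad_abs is the row-wise absolute value of Cantidad (an assumption based on the column name and the matching maxima; the report does not state it), the two reported sums are enough to split the signed total into its positive and negative parts. A minimal sketch of that arithmetic, with illustrative variable names:

```python
# Sums copied from the Cantidad and Cantidad_abs sections.
sum_signed = 124_478_438   # sum of Cantidad
sum_abs = 137_599_388      # sum of Cantidad_abs

# With P = total of positive quantities and N = total of negative quantities:
#   sum_signed = P + N   and   sum_abs = P - N
positive_total = (sum_signed + sum_abs) // 2
negative_total = (sum_signed - sum_abs) // 2

print(positive_total, negative_total)
```

Under that assumption, the 96.5% of rows with negative quantities carry a comparatively small total, while the few positive rows account for most of the volume.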
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
10 | 11559 | 6.7%
5 | 9456 | 5.4%
8 | 9426 | 5.4%
2 | 9391 | 5.4%
7 | 9333 | 5.4%
1 | 9301 | 5.4%
3 | 9189 | 5.3%
6 | 9152 | 5.3%
4 | 9077 | 5.2%
9 | 9051 | 5.2%
Other values (3331) | 78724 | 45.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 4 | < 0.1%
1 | 9301 | 5.4%
2 | 9391 | 5.4%
3 | 9189 | 5.3%
4 | 9077 | 5.2%
5 | 9456 | 5.4%
6 | 9152 | 5.3%
7 | 9333 | 5.4%
8 | 9426 | 5.4%
9 | 9051 | 5.2%

Maximum 10 values

Value | Count | Frequency (%)
67368 | 1 | < 0.1%
67272 | 1 | < 0.1%
67200 | 1 | < 0.1%
66840 | 1 | < 0.1%
66744 | 1 | < 0.1%
66696 | 1 | < 0.1%
66480 | 1 | < 0.1%
66288 | 1 | < 0.1%
65904 | 1 | < 0.1%
65760 | 1 | < 0.1%

Interactions

[64 pairwise interaction plots not reproduced.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
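
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of the ranks' standard deviations. This is a minimal illustration, not the report's own implementation. The function names are hypothetical, and giving tied values the average of their rank positions is an assumption.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)  # the common 1/n factors cancel


# A monotonic but nonlinear relationship (y = x**3) still gives rho of about 1
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
```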

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
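
As a rough illustration of this formula and of the invariance property mentioned above, the sketch below computes r directly from the definition and then repeats the calculation after shifting and (positively) rescaling both variables. The function name `pearson_r` and the sample data are made up for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # the common 1/n factors cancel


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)  # approximately 0.8

# Changing location and scale of each variable separately leaves r unchanged
x_shifted = [2 * a + 3 for a in x]
y_shifted = [0.5 * b - 1 for b in y]
r_shifted = pearson_r(x_shifted, y_shifted)
```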

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
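
The pair-counting definition above can be sketched as follows. Taken literally, it is the variant usually called τ-a; statistical libraries often report a tie-adjusted variant (τ-b) instead, so results may differ when the data contain ties. The function name is hypothetical, and the quadratic loop over all pairs is only practical for small inputs.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x2 - x1) * (y2 - y1)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # pairs tied in x or y count as neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# 8 concordant and 2 discordant pairs out of 10
tau = kendall_tau_a([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```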

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
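
A minimal sketch of the bias-corrected coefficient, written from the published formula rather than taken from the report's source code: the chi-squared statistic is divided by n, a correction term (r - 1)(c - 1) / (n - 1) is subtracted, and the row and column counts r and c are shrunk in the same way. The function name and the example tables are illustrative assumptions.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, c = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic; cells with zero expected count are skipped
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (r - 1) * (c - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    c_corr = c - (c - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, c_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # perfect association
v_independent = cramers_v_corrected([[25, 25], [25, 25]])  # independence
```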

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index) | TransaccionProductoID | ProductoID | TipoTransaccionID | ClienteID | InvoiceID | ProveedorID | OrdenDeCompraID | FechaTransaccion | Cantidad | FechaTransaccionAjustada | NombreProducto | PrecioUnitario | PrecioRecomendado | ValorTotal_Con_PrecioUnitario | ValorTotal_Con_PrecioRecomendado | Año | Cantidad_abs
0 | 221111 | 148 | 10 | 13.0 | 46305.0 |  |  | Apr 30,2015 | -96.0 | 2015-04-30 | Halloween skull mask (Gray) L | 18 | 27 | 1728.0 | 2592.0 | 2015 | 96.0
1 | 215349 | 148 | 10 | 562.0 | 45092.0 |  |  | Apr 13,2015 | -96.0 | 2015-04-13 | Halloween skull mask (Gray) L | 18 | 27 | 1728.0 | 2592.0 | 2015 | 96.0
2 | 330727 | 148 | 10 | 1048.0 | 69359.0 |  |  | May 13,2016 | -24.0 | 2016-05-13 | Halloween skull mask (Gray) L | 18 | 27 | 432.0 | 648.0 | 2016 | 24.0
3 | 212115 | 148 | 10 | 826.0 | 44413.0 |  |  | Apr 02,2015 | -12.0 | 2015-04-02 | Halloween skull mask (Gray) L | 18 | 27 | 216.0 | 324.0 | 2015 | 12.0
4 | 211641 | 148 | 10 | 841.0 | 44310.0 |  |  | Apr 01,2015 | -72.0 | 2015-04-01 | Halloween skull mask (Gray) L | 18 | 27 | 1296.0 | 1944.0 | 2015 | 72.0
5 | 103218 | 148 | 10 | 952.0 | 21631.0 |  |  | Feb 22,2014 | -72.0 | 2014-02-22 | Halloween skull mask (Gray) L | 18 | 27 | 1296.0 | 1944.0 | 2014 | 72.0
6 | 231596 | 148 | 10 | 591.0 | 48500.0 |  |  | Jun 04,2015 | -84.0 | 2015-06-04 | Halloween skull mask (Gray) L | 18 | 27 | 1512.0 | 2268.0 | 2015 | 84.0
7 | 202123 | 148 | 10 | 528.0 | 42324.0 |  |  | Feb 26,2015 | -12.0 | 2015-02-26 | Halloween skull mask (Gray) L | 18 | 27 | 216.0 | 324.0 | 2015 | 12.0
8 | 99371 | 148 | 10 | 461.0 | 20817.0 |  |  | Feb 07,2014 | -108.0 | 2014-02-07 | Halloween skull mask (Gray) L | 18 | 27 | 1944.0 | 2916.0 | 2014 | 108.0
9 | 214833 | 148 | 10 | 127.0 | 44986.0 |  |  | Apr 10,2015 | -108.0 | 2015-04-10 | Halloween skull mask (Gray) L | 18 | 27 | 1944.0 | 2916.0 | 2015 | 108.0

Last rows

(index) | TransaccionProductoID | ProductoID | TipoTransaccionID | ClienteID | InvoiceID | ProveedorID | OrdenDeCompraID | FechaTransaccion | Cantidad | FechaTransaccionAjustada | NombreProducto | PrecioUnitario | PrecioRecomendado | ValorTotal_Con_PrecioUnitario | ValorTotal_Con_PrecioRecomendado | Año | Cantidad_abs
173649 | 253439 | 89 | 10 | 157.0 | 53094.0 |  |  | 2015-08-17 12:00:00.0000000 | -108.0 | 2015-08-17 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2015 | 108.0
173650 | 280884 | 89 | 10 | 85.0 | 58853.0 |  |  | 2015-11-19 12:00:00.0000000 | -108.0 | 2015-11-19 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2015 | 108.0
173651 | 333369 | 89 | 10 | 402.0 | 69897.0 |  |  | 2016-05-21 12:00:00.0000000 | -108.0 | 2016-05-21 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2016 | 108.0
173652 | 234969 | 89 | 10 | 439.0 | 49195.0 |  |  | 2015-06-16 12:00:00.0000000 | -108.0 | 2015-06-16 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2015 | 108.0
173653 | 181883 | 89 | 10 | 949.0 | 38078.0 |  |  | 2014-12-16 12:00:00.0000000 | -108.0 | 2014-12-16 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2014 | 108.0
173654 | 116140 | 89 | 10 | 73.0 | 24308.0 |  |  | 2014-04-16 12:00:00.0000000 | -108.0 | 2014-04-16 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2014 | 108.0
173655 | 264095 | 89 | 10 | 845.0 | 55349.0 |  |  | 2015-09-23 12:00:00.0000000 | -108.0 | 2015-09-23 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2015 | 108.0
173656 | 107529 | 89 | 10 | 542.0 | 22518.0 |  |  | 2014-03-12 12:00:00.0000000 | -108.0 | 2014-03-12 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2014 | 108.0
173657 | 123617 | 89 | 10 | 945.0 | 25882.0 |  |  | 2014-05-13 12:00:00.0000000 | -108.0 | 2014-05-13 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2014 | 108.0
173658 | 290446 | 89 | 10 | 459.0 | 60854.0 |  |  | 2015-12-24 12:00:00.0000000 | -108.0 | 2015-12-24 | "The Gu" red shirt XML tag t-shirt (Black) 3XS | 18 | 27 | 1944.0 | 2916.0 | 2015 | 108.0
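
The sample rows suggest how the derived columns relate to the raw ones: Cantidad_abs looks like the absolute value of Cantidad, and each ValorTotal column looks like Cantidad_abs multiplied by the corresponding price. The report does not state these formulas, so the sketch below treats them as an assumption and simply checks them against three rows copied from the sample. The helper name `derive` is hypothetical.

```python
# Each tuple: (Cantidad, PrecioUnitario, PrecioRecomendado,
#              ValorTotal_Con_PrecioUnitario, ValorTotal_Con_PrecioRecomendado, Cantidad_abs)
sample_rows = [
    (-96.0, 18, 27, 1728.0, 2592.0, 96.0),
    (-24.0, 18, 27, 432.0, 648.0, 24.0),
    (-108.0, 18, 27, 1944.0, 2916.0, 108.0),
]


def derive(cantidad, precio_unitario, precio_recomendado):
    """Recompute the derived columns under the assumed formulas."""
    cantidad_abs = abs(cantidad)
    return (
        cantidad_abs * precio_unitario,     # assumed ValorTotal_Con_PrecioUnitario
        cantidad_abs * precio_recomendado,  # assumed ValorTotal_Con_PrecioRecomendado
        cantidad_abs,                       # assumed Cantidad_abs
    )


# Rows where the recomputed values differ from the reported ones
mismatches = [row for row in sample_rows if derive(*row[:3]) != row[3:]]
```

For these three rows `mismatches` is empty. If the formulas hold across the dataset, they would also account for the high correlations reported between Cantidad, Cantidad_abs and the two ValorTotal columns.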